ABSTRACT

We present a stable procedure for defining and measuring the two-point angular autocorrelation function, $w(\theta) = [\theta/\theta_0(V)]^{-\Gamma}$, of faint ($25 < V < 29$), barely resolved and unresolved sources in the HST GOODS and UDF datasets. We construct catalogs that include close pairs and faint detections. We show, for the first time, that, on subarcsecond scales, the correlation function exceeds unity. This correlation function is well fit by a power law with index $\Gamma \approx 2.5$ and $\theta_0 = 10^{-0.1(V-25.8)}$ arcsec. This is very different from the values of $\Gamma \approx 0.7$ and $\theta_0(r) = 10^{-0.4(r-21.5)}$ arcsec associated with the gravitational clustering of brighter galaxies. This observed clustering probably reflects the presence of giant star-forming regions within galactic-scale potential wells. Its measurement enables a new approach to measuring the redshift distribution of the faintest sources in the sky.

1 INTRODUCTION

The two-point angular and spatial autocorrelation functions, $w(\theta)$
and $\xi(r)$, are defined as:

and

The correlation function of galaxies has been studied observationally at least as far back as Totsuji & Kihara (1969). More recently, the large-scale angular correlation function, $w(\theta)$, was measured for the galaxies in the Sloan Digital Sky Survey (SDSS) (Connolly et al. 2002) for sources with magnitude $18 < r < 22$ and on scales $10" < \theta < 1000"$. This and similar work with the 2 Degree Field Galaxy Redshift Survey (2dF) (Hawkins et al. 2003) confirmed that $w(\theta)$ is consistent with a power law form (see Fig. 1):

$w(\theta, r) = [\theta/\theta_0(r)]^{-\Gamma}$

where:

$\theta_0(r) = 10^{-\ldots}$

Figure 1. The SDSS correlation function with different limiting magnitudes.

$\theta_0$ is roughly proportional to the flux of the source. If we were to extrapolate this trend out to $r < 25$, we would expect $\theta_0 \approx 0.054"$, too small to observe with current data.

Both the SDSS and 2dF groups measured the spatial correlation function (Zehavi et al. 2002; Hawkins et al. 2003). SDSS observed $r < 22.5$, $0.02 < z < 0.13$ sources on scales of 0.14 Mpc $< r <$ 23 Mpc. These observations essentially confirm a power law model of the correlation function with exponent $\gamma \approx \Gamma + 1$, consistent with Limber (1953):

where $r_c$ is the characteristic distance to the sources and the sources are distributed with a width $\Delta r_c \approx r_c$.

Masjedi et al. (2006) extended the SDSS measurement of $\xi(r)$ down to 14 kpc, with moderate resolution difficulties on the smallest scales, and found that $\xi(r)$ is consistent with an $r^{-2}$ power law over four orders of magnitude.

Peebles (1974) proposed that the cosmological scale corre- 
lation function and its power spectrum counterpart should be re- 
lated to the microwave background fluctuation using the theory of 
gravitational perturbations. On cosmological scales, the dark mat- 
ter correlation function evolved from primordial mass fluctuations. 
Halo Occupation Distribution (HOD) frameworks are used to pre- 
dict galaxy bias within large dark matter halos (Peacock & Smith 
2000). Today, theory and experiment are in excellent agreement
on the largest scales (e.g. Tegmark et al. 2004), and measurements
down to 0.3 Mpc, including slight perturbations from a power law, 
can be explained in the HOD framework (Zehavi et al. 2004). The 
continuation of a power law down to smaller scales is less under- 
stood, but Masjedi et al. (2006) note that it could be accommodated 
by HOD models with reasonable modifications. 

Measurements of $\xi(r)$ are fundamentally restricted to bright sources by the need for redshifts. HST and large telescopes make spectroscopic redshift measurements good for $r < 25$ and photometric measurements for $r < 27$ (Coe et al. 2006). But the faintest photometric redshifts cannot be calibrated. Ellipticity measurements are very uncertain for sources which are not significantly larger than the PSF, which hinders HST ellipticity measurements for sources dimmer than $r \approx 26$. But even for the faintest sources, we can precisely measure position and, with a large enough sample, $w(\theta)$.

The $25 < V < 29$ sources we study in this paper are only $0.1"$-$0.5"$ in size, no more than a few kpc across at any redshift and much smaller than local galaxies. We find that these sources are only significantly clustered on subarcsecond scales. Even at high redshifts, the physical correlation scales would be roughly 5 kpc and smaller, much smaller than the scales probed by Masjedi et al. (2006). These sources tend to be bluer than the luminous red galaxies (LRGs) selected by the SDSS groups, as they were selected in the V band. Blue sources with sub-galactic luminosities and sizes separated by sub-galactic distances are likely to be separate star-forming regions in the same dark matter potential wells. The cosmological effect which causes the correlation function observed in SDSS might have some influence at such small scales, but effects other than gravity (gas dynamics, star formation, supernovae and so on) dominate over any simple halo modeling. In addition, the faint source correlation function (FSCF) traces luminosity, but its relationship to mass is unclear.

Measuring $w(\theta)$ for faint sources on small scales is an important method of probing how these primarily non-gravitational effects augment the gravitational correlation function. With much larger datasets, we could study the transition from gravitational to non-gravitational domination in the correlation function. Although this is not possible with current data, the FSCF is interesting in its own right as a useful time-dependent tracer of star formation and galactic structure.

The small, faint sources we study are an important astrophysical mystery. If we extrapolate the source counts from the Hubble Ultra Deep Field (UDF) to the whole sky, we estimate that there are $\approx 10^{11}$ sources in the sky, and yet if we extend our local galaxy density out to the $\approx 10^{12}$ Mpc$^3$ comoving volume within $z \sim 4$, we obtain roughly one tenth this number. These sources are therefore likely to be the subunits of future galaxies, and studying their redshift distribution would be a useful way to probe galaxy assembly. Unfortunately, we do not know the distance to these sources. Previous estimates have ranged from $z < 1$ (Babul & Rees 1992) to $z \approx 2.5$ (He et al. 2000). Their dimness and small size make them difficult to study photometrically or geometrically, but we can study the way they cluster with some precision.

Figure 2. The faint correlation function in COSMIC (Brainerd et al. 1995) and the HDF (Villumsen et al. 1997).

In this paper, we observe the FSCF in the HST GOODS and UDF in the $0.3" < \theta < 10"$ range. The excellent angular resolution allows us to make the first statistically significant measurement of $w(\theta)$ for faint sources and to measure both $\theta_0$ and $\Gamma$. In SDSS, Li et al. (2007) found a significant correlation function of $r < 17.8$ sources on scales down to 10 kpc $\approx 2"$ at their limiting redshift of $z = 0.3$. Brainerd et al. (1995) studied the correlation function of $r < 26$ sources down to scales of $30"$ using the COSMIC imaging spectrograph and showed that it was of order 0.01 at these large scales. Villumsen et al. (1997) used $r < 29$ sources on scales down to $3"$ in the Hubble Deep Field (HDF) but did not find a correlation function larger than 0.2 or more than $2\sigma$ greater than 0. Connolly et al. (1998) used $i < 27$ sources to confirm that $w(\theta)$ was of order 0.1 on arcsecond scales, $\theta \approx 1"$. As shown in Fig. 2, the groups studying faint sources found that $w(\theta)$ was equal to only a few tenths and barely statistically significant.

In section 2 of this paper, we discuss the data used to make 
these measurements. In section 3, we explain our computational 
methods for producing simulated images. In section 4, we describe 
the production and characteristics of catalogs used in this analysis. 
We discuss how we produce our estimate of the correlation function 
in section 5. In section 6, we present our best fit models to the data, 
and in section 7 we discuss the astrophysical significance of our 
findings and how they could be applied to future surveys. 

In future papers, we will show that the three-point autocorrelation function is also measurable for these faint sources. We will provide a formalism to measure how the two-point correlation function is distorted by a gravitational lens. Finally, we will measure this effect and use it to relate the distributions of source and lens redshifts to the extent permitted by existing data.



2 SAMPLES 

Previous measurements of the FSCF were limited by low resolution and poor statistics. To overcome resolution difficulties, we use HST observations with roughly $0.12"$ resolution. To improve upon HDF measurements we use samples which are either larger or deeper to increase the total number of sources. We measure the positions of the $25 < V < 28$ sources in the HST Great Observatories Origins Deep Survey (GOODS) North and South (Giavalisco et al. 2004) and make $27 < V < 29$ measurements in the HST UDF (Beckwith et al. 2006).

         F435W   F606W   F775W   F850LP
GOODS    27.8    27.8    27.1    26.6
UDF      29.1    29.3    29.2    28.7

Table 1. The limiting magnitudes for $10\sigma$ detections of point sources in different bands of GOODS and UDF.

The GOODS fields cover roughly 160 arcmin$^2$ each in the ACS BViz bands (F435W, F606W, F775W and F850LP). The V band limiting magnitude for a $10\sigma$ detection of a point source is 27.8. The standard catalogs made detections in the z band and found 29599 sources in the South and 32048 sources in the North. Using methods detailed in section 4, we produce catalogs with 56088 and 60182 sources in the South and North respectively.

The UDF covers roughly 11 arcmin$^2$ in the same bands but probes roughly 1.5 mag deeper than GOODS. The V band limiting magnitude for a $10\sigma$ detection of a point source is 29.3. A standard catalog was made in the i band with 10,040 sources. Using methods described in section 4, we produce V band catalogs with 7298 sources.

Studying the angular correlation in the HST Cosmic Evolution Survey (COSMOS) would be an interesting extension to this work. COSMOS covers two square degrees and is complete for $0.5"$ sources down to $i = 26$ (Scoville et al. 2007). This is roughly equivalent to $V = 25$ for typical sources. Measuring the FSCF in COSMOS would allow us to fill in the gap between the $V > 25$ work here and the $r < 22$ work in SDSS. Unfortunately, the enormous size of COSMOS makes the simulation techniques we use here impractical, and measuring the COSMOS correlation function will have to be a separate effort.



3 PRODUCING SIMULATED DATA 

In order to measure the FSCF accurately, we must correct for non-astrophysical correlation effects like optical resolution limits, the incorrect deblending of sources of non-zero size and the clustering of noise peaks. Simulated images are our main tool in estimating these effects and determining how to make the best catalogs for these observations.

To make simulated data, we generated images containing only sources, convolved them with a simulated HST ACS PSF and added Gaussian noise fields that had been convolved with a separate noise correlation PSF. We tested and rejected many parameterizations of the source characteristics, source distribution, noise models and PSFs. This simulation required that many parameters be fine-tuned to match the statistical properties of GOODS and UDF. In the following subsections we describe these parameters, the values we adopted and the quantitative rationale for these choices.

3.1 Simulated Source Profiles 

We compared real ($25 < V < 27$) sources to the best-fit de Vaucouleurs, Lorentzian, exponential and Gaussian profiles. After considerable experimentation, we adopted the de Vaucouleurs profile, but found that it performed only moderately better than the other profiles, because the sources are small and barely above threshold. For all fits, the profile is assumed to be elliptical, and we convert to a circular profile using an effective radius, $r$, defined by:

$r^2 = a_1 (x - x_0)^2 + a_2 (y - y_0)^2 + a_3 (x - x_0)(y - y_0)$    (6)

with $(x_0, y_0)$ being fit to define a centre and $(a_1, a_2, a_3)$ being fit to give the source an arbitrary ellipticity and orientation.

In the case of the de Vaucouleurs profile, we taper the sharp central peak when $r < 0.5\,r_{1/2}$, where $r_{1/2}$ is the half-light radius. In this central region of a few pixels, we replace $r$ with $0.25\,(r_{1/2} + r^{\ldots})$. The sharp peak would be flattened by the PSF and is in fact not even observed in fully resolved sources (Alam & Ryden 2002).

In addition to the above shape parameters, we also fit an integrated intensity, $I_0$, and a simple background with a linearly varying intensity, so that the function to which we fit the surface brightness is:

$B(x, y) = b_0 + b_1 x + b_2 y + I_0 f(r^2)$    (7)

where $f$ is the normalized intensity of the profile being tested.

For each source the data are fit inside a square with sides of length $3\sqrt{A}$, where $A$ is the detection area (a typical source and fit are shown in Fig. 3). We use the standard deviations implied by the weight images and compare the $\chi^2$ per degree of freedom for each fit type. We find that our modified de Vaucouleurs profile has an average $\chi^2/N_{\rm dof}$ of 1.386, while the Gaussian, Lorentzian and exponential profiles have values of 1.396, 1.396 and 1.391 respectively, so the choice of profile was not critical. We use the de Vaucouleurs profile but note that the difference in fit quality for such faint sources is minimal.

In principle, the variation of profile brightness at large radius could influence pair finding. However, for $25 < V < 26$ sources we only search for pairs on scales $\theta > 0.8"$. The average de Vaucouleurs intensity for simulated sources at this distance is $2.3 \times 10^{-4}$ s$^{-1}$, while the GOODS noise threshold is $4.2 \times 10^{-3}$ s$^{-1}$. This value is smaller for all dimmer sources at their respective minimum pair distance (see section 6). The influence on pair finding is largest for the de Vaucouleurs profile that we have chosen to use, and even here it does not greatly affect our final results: intensities of 5% of the threshold will have minimal influence on the detection. We plot the intensity of idealized de Vaucouleurs, Lorentzian, exponential and Gaussian profile sources with $V \approx 25.5$ and typical detection area in Fig. 4. All other profiles drop off faster than the de Vaucouleurs profile on the pertinent distance scales, so the tails at such scales must be minimally important.

3.2 Simulated Source Distribution 

Our simulated catalogs are designed to match the magnitude, detec- 
tion area and ellipticity distributions of GOODS and the UDF. We 
produced many simulated images using different input distributions 
until the output catalogs matched the data. 

We start by using a broken exponential magnitude distribution in the V band:

$P(V) \propto \Theta(V_{\max} - V)\, e^{nV}$    (8)

$n = 0.92$ for $V < 27.5$; $n = 0.72$ for $V > 27.5$    (9)

where $V_{\max} = 29.5$ for GOODS and 31 for UDF.
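As an illustration of equations (8) and (9), the following minimal sketch draws magnitudes from a broken exponential by inverse-transform sampling. The bright cutoff `v_min`, the continuity of the distribution at the break and the function name are assumptions made for this example; they are not specified in the text.

```python
import numpy as np

def sample_magnitudes(n_src, v_min=24.0, v_break=27.5, v_max=29.5,
                      n_bright=0.92, n_faint=0.72, rng=None):
    """Draw V magnitudes from P(V) ~ exp(n V) with a slope change at v_break.

    v_min is a hypothetical bright cutoff, and the unnormalised density is
    assumed to be continuous at the break.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Unnormalised density relative to the break:
    #   p(V) = exp(n_bright * (V - v_break))  for v_min <= V < v_break
    #   p(V) = exp(n_faint  * (V - v_break))  for v_break <= V <= v_max
    lo_b = np.exp(n_bright * (v_min - v_break))
    hi_f = np.exp(n_faint * (v_max - v_break))
    w_bright = (1.0 - lo_b) / n_bright      # integral of the bright segment
    w_faint = (hi_f - 1.0) / n_faint        # integral of the faint segment
    p_bright = w_bright / (w_bright + w_faint)

    in_bright = rng.random(n_src) < p_bright
    u = rng.random(n_src)
    # Inverse CDF inside each segment
    v_b = v_break + np.log(lo_b + u * (1.0 - lo_b)) / n_bright
    v_f = v_break + np.log(1.0 + u * (hi_f - 1.0)) / n_faint
    return np.where(in_bright, v_b, v_f)

mags = sample_magnitudes(10000, rng=np.random.default_rng(0))
faint_fraction = np.mean(mags >= 27.5)
```

With the GOODS value of $V_{\max}$ and the assumed cutoff, most of the simulated sources fall faintward of the break, which is why the detection threshold, not the input distribution, sets the faint end of the output catalog.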

When determining the width of a source, we scale each source so that it can be seen above the background intensity threshold, $T$. In the absence of a PSF or ellipticity our sources would have the de Vaucouleurs intensity profile:

$I(r) = \ldots$

where $I_0$ is just the total integrated intensity of the source.

Figure 3. A typical source profile from real GOODS data (left) and the residuals left by a de Vaucouleurs fit (right).

Figure 4. The radial profiles of a simulated $V \approx 25.5$ source using the de Vaucouleurs, Lorentzian, exponential and Gaussian fits. The total intensity within $\theta_T \approx 0.1"$ is constant. The de Vaucouleurs tail is the largest, but it is small at pair separation scales of $\theta_{\min} \approx 8\theta_T$.

The term $r_0$ determines the width of our sources. We must make a series of variable transformations to produce a distribution of input $r_0$'s that leads to an accurate distribution of detected output widths. We start by noting that for an appropriately bright and small source, there is some $r_T$ such that $I(r_T) = T$. The size of the source that we are interested in is the area that exceeds the threshold, $\pi r_T^2$. We define a normalized area above the threshold, $A_T$, and a normalized area $A_0 \propto r_0^2$. Relating the two tells us how to relate the input source width to the detected source area:

$A_T = 0$ when $A_0 = 1$, because sources more diffuse than this have central luminosities below threshold. We only make sources with $0 < A_0 < 1$. This function has sharp, undesirable behavior near its maximum at $A_0 = e^{-8}$. Using the variable transformation $\alpha = \log(A_0)$ we can make a smoother, manageable function, which we Taylor expand around the maximum at $\alpha = -8$:

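$A_T = \alpha^8 e^{\alpha} \approx 8^8 e^{-8} \left[1 - \frac{1}{16}(\alpha + 8)^2\right]$,    (14)

which we can invert to determine a range on the parameter $\alpha$:

$\alpha_T = -8 \pm \sqrt{16 - \frac{2 e^8}{8^7} A_T}$    (15)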
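The expansion in equation (14) and its inverse in equation (15) can be checked numerically. The sketch below assumes that the normalized area behaves as $A_T = \alpha^8 e^{\alpha}$ with $\alpha = \ln A_0$, which is what a profile of the form $\exp[-(r/r_0)^{1/4}]$ gives when the areas are normalized as $A_T = 8!\,T\,\pi r_T^2 / I_0$. That normalization is our reconstruction and not taken from the text, and all function names are illustrative.

```python
import math
import numpy as np

def area_above_threshold(alpha):
    """Normalised thresholded area A_T = alpha^8 * exp(alpha), alpha = ln(A_0) < 0."""
    return alpha**8 * np.exp(alpha)

def area_taylor(alpha):
    """Second-order Taylor expansion of A_T around its maximum at alpha = -8."""
    return 8.0**8 * math.exp(-8.0) * (1.0 - (alpha + 8.0)**2 / 16.0)

def alpha_range(a_t):
    """Invert the Taylor form: the interval of alpha for which A_T >= a_t."""
    half_width = math.sqrt(16.0 - 2.0 * math.exp(8.0) / 8.0**7 * a_t)
    return -8.0 - half_width, -8.0 + half_width

# Locate the maximum on a grid and compare with the analytic value.
alphas = np.linspace(-16.0, -0.5, 3101)
alpha_peak = alphas[np.argmax(area_above_threshold(alphas))]
a_max = 8.0**8 * math.exp(-8.0)

# Under the assumed normalisation, a 10-pixel area corresponds to
# A_T = 10 * 8! * T / I_0, so a source can only be detected at all if
# I_0 / T exceeds the value below.
min_brightness = 10.0 * math.factorial(8) / a_max
```

Under these assumptions the minimum brightness comes out just below 72, in line with the $I_0/T$ cut quoted in the text.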



At least 10 pixels must be above threshold, or $\pi r_T^2 > 10$. This yields a range on $\alpha$ for which the normalized area $A_T$ exceeds the equivalent of 10 pixels:

We ignore sources with $I_0/T < 72$ as they are below our detection threshold. After much experimentation, we find that we can reproduce the actual area distribution of sources best if we select $\alpha$ from a uniform distribution in the bounds:

(17)



Finally, we make our sources ellipses with random orientations. The intrinsic ellipticity is chosen as uniform between 0 and 0.9, which reproduces the ellipticity distribution after processing.
The position of each source is uniform and random except for the 
'partner' sources described in subsection 3.7. 

3.3 PSF Simulation 

After producing idealized sources we must use an accurate PSF so that small sources are properly blurred and small-angle correlations resemble those of the actual image. We cannot use a separate PSF for each of the roughly 120,000 sources in each of several hundred simulations, so we use a single PSF over our entire field and convolve the simulated image using the FFTW library (Frigo & Johnson 2005).

The PSF is constructed using the 'Tiny Tim' program designed to simulate HST PSFs (Krist 1995). To simulate an average galaxy in the GOODS fields, we assume a power law source spectrum with spectral index $n = -0.1$. We average together 100 PSFs at random positions on each of the two ACS chips and include $0.007"$ of jitter. We apply the electron diffusion PSF in the Tiny Tim documentation, taking care to modify the pixel size to $0.03"$. The final PSF has a fitted width that is roughly 0.98 times the average fitted width of a random sample of 10 point-like sources.
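The single-PSF convolution described in this subsection can be sketched as follows. This is only an illustration: it uses NumPy's FFT routines in place of FFTW, assumes periodic image boundaries, and the function name and the toy Gaussian PSF are hypothetical stand-ins for the Tiny Tim PSF.

```python
import numpy as np

def convolve_with_psf(image, psf):
    """Circularly convolve an image with a single, centred PSF using FFTs."""
    psf = psf / psf.sum()                      # conserve total flux
    padded = np.zeros(image.shape, dtype=float)
    ph, pw = psf.shape
    padded[:ph, :pw] = psf
    # Move the PSF centre to pixel (0, 0) so sources are not shifted.
    padded = np.roll(padded, (-(ph // 2), -(pw // 2)), axis=(0, 1))
    spectrum = np.fft.rfft2(image) * np.fft.rfft2(padded)
    return np.fft.irfft2(spectrum, s=image.shape)

# Toy example: one point source blurred by a small Gaussian PSF.
yy, xx = np.mgrid[-3:4, -3:4]
toy_psf = np.exp(-(xx**2 + yy**2) / (2.0 * 1.5**2))
point = np.zeros((32, 32))
point[10, 12] = 1.0
blurred = convolve_with_psf(point, toy_psf)
```

A single FFT pair handles every source at once, which is the reason a position-independent PSF is attractive when an image holds on the order of 10^5 sources.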



3.4 Simulated Noise 

In all simulated images, we use background noise that approximates the small-scale structure of our image and matches the gross statistical properties of the noise in our data. The noise is a field of Gaussian random numbers with variance proportional to the inverse of a weight image at each pixel. To simulate drizzling and cosmic correlation, we convolve this noise with a modified top hat function with bin values proportional to the fraction of their area that a 0.98 pixel radius circle would fill. We scale the standard deviation of the Gaussian field so that we match the roughly 0.0025 s$^{-1}$ rms calculated by Oxtractor in GOODS. Our rms in UDF is roughly 0.0007 s$^{-1}$.
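The noise recipe above can be illustrated with a short sketch. It assumes a 3x3 kernel, estimates the pixel coverage fractions by sub-pixel sampling, uses periodic boundaries for the convolution and rescales to a single global rms; these choices and the function names are assumptions made for this example.

```python
import numpy as np

def disc_kernel(radius=0.98, size=3, oversample=50):
    """Kernel whose bins hold the fraction of each pixel covered by a disc."""
    n = size * oversample
    # Sub-pixel sample centres in pixel units, origin at the kernel centre
    coords = (np.arange(n) + 0.5) / oversample - size / 2.0
    yy, xx = np.meshgrid(coords, coords, indexing="ij")
    inside = (xx**2 + yy**2) <= radius**2
    frac = inside.reshape(size, oversample, size, oversample).mean(axis=(1, 3))
    return frac / frac.sum()

def simulate_noise(weight, target_rms, rng=None):
    """Correlated Gaussian noise with variance proportional to 1 / weight."""
    rng = np.random.default_rng() if rng is None else rng
    white = rng.standard_normal(weight.shape) / np.sqrt(weight)
    kernel = disc_kernel()
    correlated = np.zeros_like(white)
    # Direct 3x3 convolution with periodic boundaries
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            correlated += kernel[dy + 1, dx + 1] * np.roll(white, (dy, dx), axis=(0, 1))
    # Rescale so that the global rms matches the measured value
    return correlated * (target_rms / correlated.std())

kernel = disc_kernel()
noise = simulate_noise(np.full((64, 64), 4.0), 0.0025,
                       rng=np.random.default_rng(1))
```

The convolution leaves neighbouring pixels positively correlated, which is the property that makes clustered noise peaks a concern for pair counts.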

With these constraints, the number of counts in the simulated negative image (the image multiplied by $-1$ so that sources are ignored by Oxtractor) is equal to the number of counts in the negative of the real image when we lower the detection threshold to $1.4\sigma$ (to increase counts to roughly 200). Zodiacal light, sunlight scattered off dust, is the largest background for HST observations (Bernstein et al. 2002), and our background is roughly consistent with a constant zodiacal glow.



3.5 Comparison of Simulated Image Catalogs with Data Catalogs

The above procedures produced catalogs with distributions in V magnitude, detection area and ellipticity similar to those in the actual GOODS and UDF catalogs, as shown in Fig. 5. The number of sources within any magnitude band was within a few percent of the observed value, and the area and ellipticity distributions are similarly close to those observed in the GOODS field.



3.6 False Detections in Simulated Images 

The detection algorithm in section 4 was guided by our study of 
false detections in simulated images. To find false sources, we com- 
pared the positions of detected sources to those of actual sources in 
our input image. We made a catalog of detected sources that were 
more than 0.3" (ten pixels) away from actual sources (as defined 
by an input catalog of sources). We never include pairs closer than 
this in our correlation function calculation. These potential detec- 
tion areas cover a total of around 10 percent of the detection area. 
Increasing the distance at which we are willing to associate a de- 
tection with an input source beyond 0.3" decreased the number of 
false detections at a rate consistent with the decrease in area that 
was "far from a source". This indicates that false detections are not 
strongly clustered around sources on scales greater than 0.3". We
find roughly 270 false sources in GOODS South, 0.5% of
our total sources. This fraction is roughly constant across magnitude.



3.7 Simulated Clustering 

In order to evaluate our ability to measure an intrinsic correlation function, we must see how well we measure the correlation function in simulated clustered datasets. To make these datasets we start with an unclustered data set and assign each source $n_p$ partners, where the distribution of $n_p$ is:

$P(n_p) = \frac{e^{-n_p/n_0}}{n_0}; \quad n_0 = \rho_0 \int_{0.2"}^{10"} \left(\frac{\theta}{\theta_0}\right)^{-2.5} 2\pi\theta\, d\theta$    (18)

where $\theta_0 = 0.432"$ ($0.27"$) for GOODS (UDF).

These clustered sources increase the total number of sources by 50% (33%). Each extra source is assigned a separation angle, $\theta$, from its parent with a distribution:

$P(\theta) \propto \theta^{1-2.5}$    (19)

between a minimum $\theta$ of $0.2"$ ($0.1"$) in GOODS (UDF) and a maximum $\theta$ of $10"$. We pick a uniformly distributed random position angle $\phi$.
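The separation distribution of equation (19) can be sampled in closed form. The sketch below covers only the placement of partners relative to their parents, by inverse-transform sampling of the separation and a uniform position angle; drawing the number of partners per source is left to the caller, and the function name is hypothetical.

```python
import numpy as np

def partner_offsets(n_partners, theta_min=0.2, theta_max=10.0,
                    gamma=2.5, rng=None):
    """Draw (dx, dy) offsets in arcsec with P(theta) ~ theta^(1 - gamma).

    Requires gamma != 2 so that the cumulative distribution is a power law.
    """
    rng = np.random.default_rng() if rng is None else rng
    k = 2.0 - gamma                       # exponent of the cumulative distribution
    lo, hi = theta_min**k, theta_max**k
    u = rng.random(n_partners)
    theta = (lo + u * (hi - lo)) ** (1.0 / k)
    phi = rng.uniform(0.0, 2.0 * np.pi, n_partners)
    return theta * np.cos(phi), theta * np.sin(phi)

dx, dy = partner_offsets(20000, rng=np.random.default_rng(2))
separation = np.hypot(dx, dy)
```

For the GOODS limits, half of the partners generated this way lie within about 0.6" of their parent, so the simulated clustering signal is concentrated on subarcsecond scales.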

This method produces power law distributed clumps but be- 
cause of clump-clump correlations does not produce perfect power 
law behavior. Nor does it produce an exact match to the observed 
correlation function. We use these simulations only to study how 
well our measurement algorithm recovers an intrinsic correlation. 



4 CATALOG PRODUCTION 

Attempts to measure $w(\theta)$ and $\xi(r)$ in bright galaxy surveys are rarely confused about what is being counted. Bright $r < 22$ galaxies are physically distinct 'island universes', and although they are observed to collide and merge, the autocorrelation statistics are not seriously hindered by decisions about whether or not to count a comparatively rare interacting pair as one galaxy or two. However, when considering the FSCF in our sample, we quickly realize that $\theta_0$ is only a few times larger than the physical size of the sources and the resolution of the observations. This implies that we must be scrupulous in defining sources and consistently use the same definition when comparing with simulations of galaxy formation.

4.1 Source Extraction 

When making our catalogs for FSCF study, we designed a source 
extraction routine geared to look for faint, compact sources and de- 
blend aggressively. In exchange for this increase in performance, 
we allowed for around 0.5% false sources that the more conserva- 
tive GOODS catalog lacks. 

We started with the catalog procedures used by the GOODS 
team and modified them to look for faint sources and pairs. In mak- 
ing their catalogs, the GOODS team used a modified version of the 
SExtractor (Bertin & Arnouts 1996) program called Object Extrac- 
tor (Oxtractor) that is designed to better extract faint sources near 
bright neighbors by modifying the noise floor in these areas. Ox- 
tractor also avoids including spurious noise in the area of a source 
as part of that source. We borrowed their code to produce our own 
catalog. 

We modified the GOODS team's procedures at several stages.
Our most distinct change in method from the GOODS team was to
use the V band (F606W) instead of the z band (F850LP). Using any
reasonable SExtractor parameters designed to find faint sources, we
find more GOODS sources in the V band. For our particular set of
parameters, we found 56088 sources in the V band and only 29601 in



© 2008 RAS, MNRAS 000, 1-15 



6 Morganson et al. 




                           Standard GOODS Catalog   Custom Catalog
WEIGHT_TYPE                MAP_RMS                  MAP_WEIGHT
Filtering FWHM (pixels)    5.0                      1.5
DETECT_THRESH              0.6                      1.7
DETECT_MINAREA             16                       10
DEBLEND_NTHRESH            32                       16
DEBLEND_MINCONT            0.03                     0.03

Table 2. The standard and custom GOODS SExtractor parameters.



z. This suggests that many faint sources are blue star-forming regions. We also changed the actual Oxtractor detection parameters to better find faint, small sources. Our changes are summarized in Table 2. One should note that the GOODS team used RMS images (not publicly available) that are normalized differently from our weight images and that, accounting for this difference, our DETECT_THRESH is roughly equivalent to theirs.

We arrived at these numbers by examining the correlation function of the $27 < V < 28$ sources in uncorrelated simulation images, the number of detections in real images and the number of false counts in simulated images. To determine a filtering FWHM, we ran Oxtractor with different Gaussian filters of width between 1 and 5 pixels ($0.03"$-$0.15"$) to produce catalogs from simulated uncorrelated data. The correlation function of these catalogs would ideally be zero for all values of $\theta$, but we found that it became significantly negative at distances of roughly 3 times the FWHM. Running Oxtractor without filtering or while using only a small filter causes the correlation function to become very large on scales less than $0.2"$. A FWHM of 1.5 pixels was the smallest we could use without introducing this small-angle peak.

We again used this correlation function to study the DEBLEND_NTHRESH-DEBLEND_MINCONT parameter space. We qualitatively found the parameters which minimize spurious deblending, manifested by a large correlation of what should be uncorrelated data at separations of roughly $0.1"$-$0.3"$, without hindering our effective resolution, manifested as a suppression of $w(\theta)$ on scales of $0.2"$-$0.5"$. Our final choice was identical to that of Benitez et al. (2003).

Figure 6. The magnitude distribution in the standard and custom GOODS South catalogs. The custom GOODS catalog contains nearly twice as many sources.

We examined the two-dimensional space of DETECT_MINAREA and DETECT_THRESH using total counts from the real image and false counts in a simulated image. The PSF is roughly 3 pixels wide, so we centred our search around DETECT_MINAREA $\approx 9$. We tried every integer value between 6 and 20 pixels. Roughly speaking, we wanted to focus on aggressive $5\sigma$ detections. For DETECT_MINAREA = 9, a $5\sigma$ detection corresponds to DETECT_THRESH $\approx 1.7$. We tested every 0.1 interval of DETECT_THRESH between 0.6 and 3. Our goal was to obtain the largest number of detections with 99.5% purity. Our final setting of 10 pixels at 1.7 yields an average of 60,000 detections in a simulated GOODS South data set with 270 false detections.
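The link between a $5\sigma$ detection and a per-pixel threshold of about 1.7 follows from simple noise arithmetic, sketched below. The sketch assumes uncorrelated pixel noise, so that $N$ pixels each at $t\sigma$ give a summed significance of $t\sqrt{N}$; real drizzled noise is correlated, so this is only a rough guide, and the function names are illustrative.

```python
import math

def per_pixel_threshold(total_sigma, min_area):
    """Per-pixel threshold (in sigma) at which min_area pixels sum to total_sigma."""
    return total_sigma / math.sqrt(min_area)

def total_significance(pixel_sigma, min_area):
    """Summed significance of min_area pixels that each sit at pixel_sigma."""
    return pixel_sigma * math.sqrt(min_area)

thresh_9px = per_pixel_threshold(5.0, 9)    # search centre: 9 pixels at 5 sigma
adopted = total_significance(1.7, 10)       # adopted setting: 10 pixels at 1.7
```

Under this assumption the adopted setting of 10 pixels at 1.7 corresponds to a slightly stricter cut than the $5\sigma$ target, roughly $5.4\sigma$.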

Our source extraction procedure detects only 51 negative sources in GOODS South and an average of about 270 false detections in simulated images (as described in section 3). This catalog raises the total number of counts from 29601 to 56088 in GOODS South. In addition, Fig. 6 shows that we improved completeness of extended sources from roughly $V = 26$ to $V = 27.5$. To justify our previous claim that the V band is preferable for detecting many faint sources, we note that we detect only 29488 sources and 67 negative sources in the z band with the above parameters.



4.2 Masking 

Our source extraction methods fail in two areas of our images, so we masked these areas out separately. High noise near the edges of our images produces a large number of correlated pairs. Improper background subtraction and real structure near bright sources produced many suspect pairs.

Near the edges of our images, effective exposure time drops off, and the background noise becomes large. These areas looked qualitatively different from the rest of the image and produced a disproportionate number of close pairs, so we removed them from our images. To do this, we convolved the weight images with a 30 pixel ($0.9"$) top hat and excised all areas in the original weight and science images where the weight in this convolved image was less than 40,000. These masked areas were concentrated almost entirely near the edges of the images and amounted to roughly 2% of the original images.
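The edge mask can be illustrated with a box filter on the weight image. The sketch below assumes a square, normalized (mean) top hat and zero padding outside the image; the text does not say whether the top hat was square or circular, or how it was normalized, so both are assumptions, as are the function names.

```python
import numpy as np

def box_mean(image, size):
    """Mean of the image in a size x size box around each pixel (zero-padded)."""
    pad = size // 2
    padded = np.pad(image.astype(float), ((pad, size - 1 - pad),) * 2)
    # Summed-area table with a leading row and column of zeros
    integral = np.pad(padded.cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0)))
    total = (integral[size:, size:] - integral[:-size, size:]
             - integral[size:, :-size] + integral[:-size, :-size])
    return total / size**2

def edge_mask(weight, size=30, min_weight=40000.0):
    """True where the smoothed weight is high enough to keep the pixel."""
    return box_mean(weight, size) >= min_weight

# Toy weight map: deep everywhere except a shallow strip along one edge.
weight = np.full((100, 100), 50000.0)
weight[:, :10] = 1000.0
keep = edge_mask(weight)
```

Zero padding lowers the smoothed weight near the image border, so border pixels are rejected even when their own weight is high, which matches the intent of removing the noisy edges.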

Bright ($V < 21$) sources present two separate problems. First, errors in background subtraction from these sources affect a significant area and are not accounted for in our source extraction procedures. Second, our aggressive deblending means that a small number of bright, nearby sources with complex structure can be split into many faint pairs and significantly influence the overall correlation function. We are only interested in studying faint sources that are not obviously associated with a bright source. To mask out these bright sources, we reject all sources contained in ellipses which are four times the area of the bright source. These masked areas amounted to roughly 2% of the original images.



4.3 Characterizing the Catalogs 

Our catalogs contain many faint sources that are small in extent 
and separation. The example sources in Fig. 7 range in magnitude 
between V = 26.31 and V = 27.49 and in diameter between 0.1" 
and 0.3". The separation lengths between detected sources range 
between 0.3" and 1.2". 

These are 2.5 to 3.5 magnitudes dimmer than those used in large-scale studies like SDSS. Converting from the SDSS r band to the HST V band is imprecise. We observe that sources have $B - V$ of roughly 1.1 and make the rough conversion between detection bands via (Jester et al. 2005):



z      Absolute V    Arcsecond Linear Size (kpc)
0.1    -11.3         1.8
0.2    -12.9         3.3
0.5    -15.3         6.1
1      -17.1         8.0
2      -19.0         8.5
5      -21.4         6.4

Table 3. The absolute magnitude of a source with apparent magnitude 27 (in an appropriately blueshifted V band) and the physical distance corresponding to 1" at various redshifts.
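Quantities of the kind listed in Table 3 can be approximated with a few lines of numerical cosmology. The sketch below assumes a flat Lambda-CDM model with H0 = 70 km/s/Mpc and Omega_m = 0.3 and applies no K-correction; the cosmology behind the table is not stated in this section, so these parameters and the function names are assumptions, and the results agree with the table only approximately.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s

def comoving_distance_mpc(z, h0=70.0, omega_m=0.3, n_steps=10001):
    """Line-of-sight comoving distance in a flat Lambda-CDM universe."""
    zs = np.linspace(0.0, z, n_steps)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zs)**3 + (1.0 - omega_m))
    # Trapezoidal integration of dz / E(z)
    integral = np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zs))
    return C_KM_S / h0 * integral

def kpc_per_arcsec(z, **cosmo):
    """Physical size in kpc subtended by one arcsecond at redshift z."""
    d_a_mpc = comoving_distance_mpc(z, **cosmo) / (1.0 + z)
    return d_a_mpc * 1000.0 * np.pi / (180.0 * 3600.0)

def absolute_magnitude(m_app, z, **cosmo):
    """Absolute magnitude from the distance modulus alone (no K-correction)."""
    d_l_pc = comoving_distance_mpc(z, **cosmo) * (1.0 + z) * 1.0e6
    return m_app - 5.0 * np.log10(d_l_pc / 10.0)

scale_z1 = kpc_per_arcsec(1.0)
abs_mag_z1 = absolute_magnitude(27.0, 1.0)
```

At $z = 1$ this gives roughly 8 kpc per arcsecond and an absolute magnitude near $-17.1$ for an apparent magnitude of 27, close to the corresponding row of the table.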



Sample        Subsample      Sources   10" pairs   1" pairs
GOODS South   25 < V < 26    6266      10653       591
              26 < V < 27    12282     41049       1514
              27 < V < 28    18356     92252       1885
GOODS North   25 < V < 26    6535      11405       626
              26 < V < 27    12080     36603       1442
              27 < V < 28    18016     81140       2015
UDF           27 < V < 28    1339      6154        141
              28 < V < 29    2111      15394       227

Table 4. The number of sources, 10" pairs and 1" pairs for each subsample.



5 CORRECTING MEASUREMENT ERROR IN THE 
CORRELATION FUNCTION 

For large angle correlation functions, nonuniform survey depth is
usually the major source of measurement error. After masking out
the edges of the survey (see subsection 4.2) we have relatively
uniform surveys and, in particular, very little survey depth
structure on the arcsecond scale. In addition, conspicuous causes of
'false sources' such as diffraction spikes and cosmic rays near the
edge of the image, where drizzling is not effective, are again
cleanly removed by this masking.

Proper deblending of distinct objects and improper deblending of
single sources is our main source of measurement error. We examine
this problem from two perspectives. First, we estimate a correlation
function caused by imperfect measurement techniques using
intrinsically unclustered simulated data. Second, we estimate our
ability to measure actual clustering using simulated clustered data.



r = V - 0.42 (B - V) + 0.11 \approx V - 0.36    (20)



Given the approximate nature of the spectra, this should be 
taken as only a rough conversion, but the SDSS cutoff magnitude 
of r = 22.5 is roughly equivalent to a GOODS/UDF V = 22.9 
cutoff, two magnitudes brighter than the dimmest sources we use. 
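The band conversion above is simple arithmetic, and a short sketch makes the quoted cutoff easy to check. The function names below are illustrative, and the default color B − V = 1.1 is the rough value quoted in the text rather than a measured quantity.

```python
def r_from_v(v_mag, b_minus_v=1.1):
    """Approximate SDSS r from HST V using r = V - 0.42 (B - V) + 0.11."""
    return v_mag - 0.42 * b_minus_v + 0.11


def v_from_r(r_mag, b_minus_v=1.1):
    """Invert the same linear relation to go from r back to V."""
    return r_mag + 0.42 * b_minus_v - 0.11


# The SDSS cutoff of r = 22.5 expressed as a V-band cutoff.
print(v_from_r(22.5))
```

With the assumed color, the r = 22.5 cutoff lands close to the V = 22.9 value quoted above.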

We also work on much smaller scales. In SDSS, the typical source
separation was 100", resolution limited the source size to 1.4" and
statistics limited the source separation to 10". Our typical source
separation is roughly 5", our resolution is 0.12" and we see pairs
with 0.3" separation.

We place these numbers in an astrophysical context in Table 3. For
reference, M31 has an absolute magnitude of roughly V = −20 and the
bulk of its luminosity is from a disk roughly 20 kpc across.
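The entries of Table 3 can be approximated with a few lines of numerical integration. The sketch below is not the calculation used for the table: it assumes a flat universe with H0 = 70 km/s/Mpc and a matter density of 0.3 (neither value is stated in this section) and applies no K-correction, in the spirit of the "appropriately blueshifted V band".

```python
import math

H0 = 70.0          # km/s/Mpc, assumed for illustration
OMEGA_M = 0.3      # assumed; flat universe, so dark energy density is 1 - OMEGA_M
C_KM_S = 299792.458
ARCSEC_RAD = math.pi / (180.0 * 3600.0)


def comoving_distance_mpc(z, steps=2000):
    """Line-of-sight comoving distance, integrated with the trapezoid rule."""
    def inv_e(zz):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + zz) ** 3 + (1.0 - OMEGA_M))

    dz = z / steps
    total = 0.5 * (inv_e(0.0) + inv_e(z))
    for i in range(1, steps):
        total += inv_e(i * dz)
    return C_KM_S / H0 * total * dz


def absolute_magnitude(z, apparent_mag=27.0):
    """Apparent magnitude minus the distance modulus (no K-correction)."""
    d_l_pc = (1.0 + z) * comoving_distance_mpc(z) * 1.0e6
    return apparent_mag - 5.0 * math.log10(d_l_pc / 10.0)


def kpc_per_arcsec(z):
    """Physical length subtended by one arcsecond at redshift z."""
    d_a_kpc = comoving_distance_mpc(z) / (1.0 + z) * 1.0e3
    return d_a_kpc * ARCSEC_RAD


for z in (0.1, 0.2, 0.5, 1.0, 2.0, 5.0):
    print(z, round(absolute_magnitude(z), 1), round(kpc_per_arcsec(z), 1))
```

Under these assumptions the output tracks the tabulated values closely, which suggests the table was built with a similar cosmology.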

Finally, in Table 4, we note the number of sources, 10" pairs and
1" pairs. The crucial number is the number of 1" pairs. The Poisson
noise of this number gives us a rough idea of how well we can
measure our subarcsecond correlation function and shows that we
cannot avoid at least a few percent error.
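As a quick check of the "few percent" statement, the fractional Poisson noise 1/sqrt(N) can be evaluated for the 1" pair counts of Table 4. This is only a rough illustration; the dictionary layout and function name are ours.

```python
import math

# 1" pair counts copied from Table 4.
ONE_ARCSEC_PAIRS = {
    ("GOODS South", "25 < V < 26"): 591,
    ("GOODS South", "26 < V < 27"): 1514,
    ("GOODS South", "27 < V < 28"): 1885,
    ("GOODS North", "25 < V < 26"): 626,
    ("GOODS North", "26 < V < 27"): 1442,
    ("GOODS North", "27 < V < 28"): 2015,
    ("UDF", "27 < V < 28"): 141,
    ("UDF", "28 < V < 29"): 227,
}


def poisson_fractional_error(n_pairs):
    """Fractional Poisson noise on a count of n_pairs."""
    return 1.0 / math.sqrt(n_pairs)


for (sample, subsample), n_pairs in ONE_ARCSEC_PAIRS.items():
    print(sample, subsample, round(100.0 * poisson_fractional_error(n_pairs), 1), "%")
```

Even the best-populated subsample stays above two percent, and the UDF subsamples are several times noisier.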



5.1 Measurement-Induced Correlations in Uncorrected 
Data 

Our approach is an extension of the statistical method for
evaluating the correlation function introduced by Hamilton (1993).
Specifically, to minimize the effects of nonuniform weighting and
uncertainty in the zero point of w(θ) we use a combination of two of
Hamilton's estimates of the correlation function:

1 + w_{est}(\theta) = \frac{\langle DD \rangle \langle RR \rangle}{\langle RD \rangle^{2}}    (21)

1 + w_{est}(\theta) = \frac{\langle DD \rangle}{\langle RR \rangle}    (22)


Here, ⟨DD⟩ is the simple estimate of the data-data correlation
function taken as a series of delta functions representing each pair
separation. ⟨RR⟩ is the random-random correlation function of many
(in our case 40) random fields emulating the appropriate dataset,
and ⟨RD⟩ is the correlation of the appropriate data field with its
corresponding random fields. We convolve ⟨RR⟩ and ⟨RD⟩ with a
Gaussian filter with angular width 0.1" to produce continuous
functions. We normalize each function by a factor chosen such that
it converges to unity at large angles.

Figure 7. A typical singleton, pair and cluster of faint sources. These
sources are from the GOODS South field and range in magnitude between
V = 26.3 (the second lowest source in the cluster) and V = 27.5 (the
singleton).

               GOODS    UDF
25 < V < 26    0.8
26 < V < 27    0.55
27 < V < 28    0.4      0.5
28 < V < 29             0.3

Table 5. θ_min in GOODS and the UDF.

The first method excels at reducing error due to survey
nonuniformity. But at small angles, resolution and source size
reduce ⟨DD⟩ and ⟨RR⟩, but not ⟨RD⟩, and the estimate of w(θ) is
artificially damped. The second method does not counter survey
nonuniformity as well as the first, but the damping in ⟨DD⟩ and ⟨RR⟩
cancels to first order to give a more accurate measurement at small
angles. Therefore, we use the first method for θ ≥ 2" and the second
method for θ < 2".



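1 + w_{est}(\theta) = \frac{\langle DD \rangle}{1 + w_M(\theta)}    (23)

1 + w_M(\theta) = \begin{cases} \langle RD \rangle^{2} / \langle RR \rangle, & \text{for } \theta \geq 2'' \\ \langle RR \rangle, & \text{for } \theta < 2'' \end{cases}    (24)

where we have defined w_M, the measurement-induced correlation
function, for our own convenience.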
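The way the two estimators are combined can be sketched with binned pair counts. This is a simplification and not the paper's pipeline: it uses ideal point catalogs on a flat sky, one random catalog instead of 40 simulated fields, and histogram bins instead of delta functions smoothed with a 0.1" Gaussian. All function names are illustrative.

```python
import numpy as np


def separations(a, b=None):
    """Flat-sky separations: distinct pairs within a, or all cross pairs a x b."""
    if b is None:
        i, j = np.triu_indices(len(a), k=1)
        d = a[i] - a[j]
    else:
        d = (a[:, None, :] - b[None, :, :]).reshape(-1, 2)
    return np.hypot(d[:, 0], d[:, 1])


def normalized_counts(sep, edges):
    """Fraction of all pairs that fall in each separation bin."""
    counts, _ = np.histogram(sep, bins=edges)
    return counts / len(sep)


def estimate_w(data, rand, edges, switch=2.0):
    """Use the eq. 21 form at or above `switch` arcsec and the eq. 22 form below."""
    dd = normalized_counts(separations(data), edges)
    rr = normalized_counts(separations(rand), edges)
    rd = normalized_counts(separations(data, rand), edges)
    w_first = dd * rr / rd**2 - 1.0     # eq. 21
    w_second = dd / rr - 1.0            # eq. 22
    centers = 0.5 * (edges[:-1] + edges[1:])
    return np.where(centers >= switch, w_first, w_second)


rng = np.random.default_rng(0)
data = rng.uniform(0.0, 100.0, size=(800, 2))   # unclustered toy catalog (arcsec)
rand = rng.uniform(0.0, 100.0, size=(800, 2))
edges = np.array([1.0, 2.0, 5.0, 10.0, 20.0])
print(estimate_w(data, rand, edges))
```

For an unclustered toy catalog both estimators should scatter around zero. In the real measurement the random fields are passed through the same source extraction as the data, so that ⟨RR⟩ carries the measurement effects described in the text.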

To first order, w_M is the correlation function one would observe
from a naive ⟨RR⟩ measurement of uncorrelated sources due to survey
geometry and improper deblending. This function can be positive if a
single source is improperly deblended or if dim sources are enhanced
by the tail of a bright source. It can be negative if two sources
are separated by less than their angular extent or the resolution of
the instrument. In Fig. 8, we compare the observed data-data
correlation function 1 + w_O ~ ⟨DD⟩ and 1 + w_M (each convolved with
a 0.1" filter). w_M is always smaller than w_O, but on small scales
it is generally within an order of magnitude of w_O and must be
handled carefully to prevent significant systematic error.

We use our estimate of the measurement correlation function in
Fig. 8 to choose θ_min, below which we do not study the correlation
function. We define θ_min as roughly 0.1" greater than the angle
where w_M approaches unity or where it becomes negative. This
corresponds roughly to the source size, below which our simple
source profile assumptions should become important.





               GOODS           UDF
               κ       λ       κ       λ
25 < V < 26    1       0.8
26 < V < 27    1       0.3
27 < V < 28    0.64    0.0     1       -0.4
28 < V < 29                    0.89    -1.2

Table 6. λ and κ used to reproduce input correlation functions in
simulated data.



5.2 Correcting Measurement Error 

For large scale correlation functions of bright sources, the naive
observed correlation function, w_O = ⟨DD⟩ − 1, and the physical
correlation function, w_P ≈ w_est, are simply related:

1 + w_O = (1 + w_P)(1 + w_M) = 1 + w_P + w_M + w_P w_M    (25)

But because we are measuring the correlation function on 
scales similar to the PSF and source size, we must adapt this 
method to account for these effects. We find that the suppression 
of the correlation on small scales is dependent on the amplitude 
of the correlation function and the magnitude of the sources being 
measured. We use a fitting model that accounts for these dependen- 
cies and is accurate well within our statistical error bars. 

We must relate the known w_O and w_M to an unknown w_P. For small
values of w_P and w_M, w_O should equal w_M + w_P since there is
only perturbative clustering and measurement error. To make a first
order approximation when w_P and w_M are not both small we use the
following model:

1 + w_O = 1 + w_P + w_M + \lambda w_P w_M    (26)



For the standard correlation function measurement, λ = 1 to make
this a product. But we find that varying λ is a convenient way to
account for nonlinear effects of measurement error in crowded
fields. The λ term is only significant on small scales where both
w_P and w_M are large.

In addition, we find that for incomplete samples, we underestimate
the correlation function even at large separations. We introduce a
parameter κ to correct for this effect in faint, incomplete samples
(27 < V < 28 in GOODS and 28 < V < 29 in UDF):

1 + w_O = 1 + \kappa w_P + w_M + \lambda w_P w_M    (27)



Figure 8. We compare w_O, the naive observed correlation function in
our data, and w_M, the correlation function induced on a random field
of sources via spurious deblending, in the GOODS South field and the
UDF. The horizontal axis is θ (arcseconds).

Using the κ and λ parameters, we can reconstruct w_P from w_O and
w_M using:

1 + w_P = 1 + \frac{w_O - w_M}{\kappa + \lambda w_M}    (28)

as shown in Fig. 9 for simulated clustered data. In these plots we
show 0.02 + w to facilitate logarithmic plotting. We calculate w_P
using the known positions of sources in our simulated clustered
images. The fits in Fig. 9 guided us to use this parameterization.
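The correction step amounts to inverting the linear model for w_P. The sketch below writes the forward model (eq. 27) and its inverse (eq. 28) as two small functions and checks that they round-trip. The function names are illustrative, and the numerical inputs are toy values apart from κ = 0.64, λ = 0.0, which are the GOODS 27 < V < 28 entries of Table 6.

```python
def observed_from_physical(w_p, w_m, kappa=1.0, lam=1.0):
    """Forward model of eq. 27: w_O = kappa * w_P + w_M + lam * w_P * w_M."""
    return kappa * w_p + w_m + lam * w_p * w_m


def physical_from_observed(w_o, w_m, kappa=1.0, lam=1.0):
    """Inverse of the model, eq. 28: w_P = (w_O - w_M) / (kappa + lam * w_M)."""
    return (w_o - w_m) / (kappa + lam * w_m)


# Toy round trip with the GOODS 27 < V < 28 parameters from Table 6.
w_p_true, w_m = 0.5, 0.2
w_o = observed_from_physical(w_p_true, w_m, kappa=0.64, lam=0.0)
print(physical_from_observed(w_o, w_m, kappa=0.64, lam=0.0))
```

With κ = λ = 1 the forward model reduces to the product form of eq. 25, which is a convenient sanity check.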

Models which related higher powers of w_P and w_M to w_O did not
improve our ability to reconstruct the input correlation function
significantly. Our simple λ fit allowed us to find the simulated w_P
with precision much greater than the statistical fluctuations in a
single GOODS or UDF measurement, so it is sufficient for our
purposes. More precise models may be employed for future work. The
essential requirement for any such method is that close pair
suppression vary with the amplitude of w_P.



6 THE ESTIMATED CORRELATION FUNCTION 

Figure 9. w_P (lines) and (w_O − w_M)/(κ + λ w_M) (datapoints) in
simulated GOODS (left) and UDF (right). We obtain w_P by finding the
correlation function in the input catalogs we use to make our
clustered simulated images.

We employ a maximum likelihood estimation technique to measure the
correlation function, w(θ), without binning in θ. Mathematically,
not binning is equivalent to binning very finely so that each bin
has either one or zero pairs in it. If we assume a Poisson
distribution in each bin, the likelihood of a particular realization
would be:

L = \prod_{i=1}^{n} \frac{\mu_i^{b_i} e^{-\mu_i}}{b_i!}    (29)

where b_i is equal to the number of pair correlations within this
angle bin (either 0 or 1) and μ_i is equal to the expected number of
pairs in this bin:

\mu_i = \sigma_0^2 (1 + w_M(\theta_i) + (\theta_i/\theta_0)^{-\Gamma} (\kappa + \lambda w_M(\theta_i))) A \, 2\pi\theta_i \, \delta\theta    (30)

where A is the area of the sample and δθ is the bin width, constant
and small enough so that μ_i << 1.

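Implicitly, we have some maximum and minimum θ defined as
θ_max − θ_min = n δθ. Hence:

\log(L) = \sum_{i=1}^{n} (-\mu_i + b_i \log(\mu_i))    (31)

        = -\sum_{i=1}^{n} \mu_i + \sum_{pairs} \log(\delta\theta) + \sum_{pairs} \log(\sigma_0^2 (1 + w_M(\theta_i) + (\theta_i/\theta_0)^{-\Gamma} (\kappa + \lambda w_M(\theta_i))) A \, 2\pi\theta_i)    (32)

In the continuous limit,

\log(L) = -\int_{\theta_{min}}^{\theta_{max}} \mu(\theta) \, d\theta + \sum_{pairs} \log(\mu(\theta_i)) + constant    (33)

where we have abbreviated the sum involving only δθ as merely a
constant which will be unimportant in the maximization process.
We introduce a continuous version of μ:

\mu(\theta) = \sigma_0^2 (1 + w_M(\theta) + (\theta/\theta_0)^{-\Gamma} (\kappa + \lambda w_M(\theta))) A \, 2\pi\theta    (34)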

We maximize L over σ_0, Γ and w_1 = (θ_0)^Γ. To find the error bars
for each parameter, we perturb its value, maximize over the
remaining parameters and use the second derivative of this
marginalized likelihood to estimate 1σ errors.
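An unbinned log-likelihood of this kind is compact to write down. The sketch below assumes the expected pair density has the form σ_0² (1 + w_M + (θ/θ_0)^(−Γ) (κ + λ w_M)) A 2πθ; the σ_0² prefactor is our reading of the damaged equations, and a constant factor in it only shifts the fitted σ_0. The measurement-induced term w_M is passed in as a function, and the toy inputs at the bottom are invented for illustration.

```python
import numpy as np


def mu_density(theta, sigma0, theta0, gamma, area, w_m, kappa=1.0, lam=1.0):
    """Expected number of pairs per unit separation (assumed form of eq. 34)."""
    wm = w_m(theta)
    power_law = (theta / theta0) ** (-gamma)
    return sigma0**2 * (1.0 + wm + power_law * (kappa + lam * wm)) * area * 2.0 * np.pi * theta


def log_likelihood(pair_theta, theta_min, theta_max, n_grid=20000, **model):
    """Unbinned log-likelihood of eq. 33, up to an additive constant."""
    grid = np.geomspace(theta_min, theta_max, n_grid)
    mu = mu_density(grid, **model)
    expected = np.sum(0.5 * (mu[1:] + mu[:-1]) * np.diff(grid))   # trapezoid rule
    pair_theta = np.asarray(pair_theta, dtype=float)
    return -expected + np.sum(np.log(mu_density(pair_theta, **model)))


def no_measurement_effect(theta):
    """Placeholder w_M = 0; a real analysis would interpolate the simulated w_M."""
    return np.zeros_like(theta)


model = dict(sigma0=0.02, theta0=0.8, gamma=2.5, area=1000.0, w_m=no_measurement_effect)
pairs = np.array([0.4, 0.7, 1.5, 3.0])   # toy pair separations in arcsec
print(log_likelihood(pairs, 0.3, 10.0, **model))
```

In practice this function would be handed to a numerical optimizer over σ_0, Γ and w_1, with each error bar taken from the curvature of the profiled likelihood as described above.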



6.1 Fit Values 

We present our final values in the GOODS North field, the GOODS
South field and the UDF in Table 7. The systematic error bars are
explained in subsection 6.2. The results are consistent. In
particular, our GOODS results for 27 < V < 28 agree with our UDF
results for the same sources.

               w(1")                    θ_0
25 < V < 26    0.889 ± 0.041 ± 0.088    0.955 ± 0.077 ± 0.038
26 < V < 27    0.511 ± 0.016 ± 0.052    0.758 ± 0.027 ± 0.031
27 < V < 28    0.298 ± 0.076 ± 0.030    0.598 ± 0.091 ± 0.024
28 < V < 29    0.113 ± 0.052 ± 0.009    0.479 ± 0.035 ± 0.016

Table 9. Best global estimates of w(1") and θ_0 assuming Γ = 2.5.

Comparison with previous results is necessarily indirect. Villumsen
et al. (1997) produced the most comparable measurement, but used r
band limits rather than V band ranges. As a point of reference, we
see that our 25 < V < 26 measurement and their 20 < r < 26
measurement are within statistical error bars for θ > 1". More
broadly, we agree that w(θ) ≈ 0.1 for θ > 1" for these faint
sources. It is our probing down to w(θ) for θ < 1" that gives us a
significant measurement.

The GOODS North and South results are in good agreement, so we make
a final estimate combining the two datasets to minimize statistical
noise. We then have our best estimate of the correlation function in
each of the four magnitude bins in Table 8. In the 27 < V < 28 bin
we use UDF data, expressing our preference for statistical
uncertainty over systematic uncertainty.

These results are consistent, within the computed errors, with
Γ = 2.5 in all cases. Assuming this value gives us slightly
different values of θ_0 in Table 9. The correlation length as a
function of limiting magnitude V is well fit by:

\theta_0(V) = 10^{-0.1 (V - 25.8)} arcsec; 26 < V < 29    (35)

This is a factor of 40 larger than what we would expect from the
extrapolation of a purely gravitational correlation function in
equation 4.
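The fit for the correlation length can be compared directly with the Table 9 values. The sketch below assumes the relation θ_0(V) = 10^(−0.1 (V − 25.8)) arcsec, with the zero point 25.8 taken as our reading of the fit, and it assumes that "limiting magnitude" means the faint edge of each bin.

```python
def theta0_arcsec(v_limit):
    """Correlation length in arcsec for limiting magnitude V (assumed zero point 25.8)."""
    return 10.0 ** (-0.1 * (v_limit - 25.8))


# theta_0 from Table 9, keyed by the faint edge of each magnitude bin.
TABLE9_THETA0 = {26: 0.955, 27: 0.758, 28: 0.598, 29: 0.479}

for v_limit, measured in TABLE9_THETA0.items():
    print(v_limit, round(theta0_arcsec(v_limit), 3), measured)
```

Under these two assumptions the relation reproduces all four tabulated correlation lengths to well within their quoted errors.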

We plot our results in Fig. 10 with the best Γ = 2.5 fit. We also
plot the Γ = 0.7 fit to show that extending the SDSS power law to
small scales vastly underestimates the number of close pairs. Note
that we are plotting 0.02 + w(θ) for GOODS and 0.1 + w(θ) for UDF,
because it allows for slightly negative points to be included in a
log-log plot. Also note that the Γ = 0.7 fits are of the form
(θ/θ_0)^{-0.7} + δ, where δ is an independently fit parameter. δ is
necessary, because different fit Γ's lead to different σ_0 values
and effectively offset w(θ).

Sample             w(1")                    θ_0                      Γ
GN 25 < V < 26     0.83 ± 0.05 ± 0.12       0.93 ± 0.10 ± 0.04       2.48 ± 0.37
GS 25 < V < 26     0.92 ± 0.05 ± 0.10       0.97 ± 0.11 ± 0.04       2.55 ± 0.15
GN 26 < V < 27     0.536 ± 0.019 ± 0.053    0.775 ± 0.043 ± 0.031    2.45 ± 0.17
GS 26 < V < 27     0.508 ± 0.020 ± 0.051    0.759 ± 0.042 ± 0.031    2.45 ± 0.17
GN 27 < V < 28     0.36 ± 0.015 ± 0.10      0.649 ± 0.035 ± 0.074    2.38 ± 0.14
GS 27 < V < 28     0.234 ± 0.013 ± 0.067    0.549 ± 0.033 ± 0.074    2.42 ± 0.20
UDF 27 < V < 28    0.296 ± 0.053 ± 0.029    0.60 ± 0.10 ± 0.02       2.41 ± 0.63
UDF 28 < V < 29    0.087 ± 0.024 ± 0.009    0.438 ± 0.042 ± 0.016    2.96 ± 0.50

Table 7. w(1"), θ_0 (in arcseconds) and Γ for GOODS North (GN), GOODS
South (GS) and the UDF.

               w(1")                    θ_0                      Γ
25 < V < 26    0.946 ± 0.034 ± 0.094    0.979 ± 0.084 ± 0.038    2.58 ± 0.28
26 < V < 27    0.520 ± 0.014 ± 0.051    0.763 ± 0.030 ± 0.031    2.43 ± 0.12
27 < V < 28    0.296 ± 0.053 ± 0.028    0.60 ± 0.10 ± 0.024      2.41 ± 0.63
28 < V < 29    0.087 ± 0.024 ± 0.009    0.438 ± 0.042 ± 0.016    2.96 ± 0.50

Table 8. Best global estimates of w(1"), θ_0 and Γ.

The σ_0 we use slightly undervalues the true density, because we
require our fitted w(θ) to always be positive. In a sense, our σ_0
represents the density of sources if there were no 'extra pairs' due
to the correlation function. Given this expected discrepancy, our
fitted results are consistent with being slightly less than
n_sources/Area and we do not print them here.



6.2 Systematic Error 

Our methods produce systematic errors related to how our
simulations differ from true images. We trace these errors to two
effects: unrepresentative source profiles and inadequate noise
models. The effects manifest themselves as errors in w_M and w_O. If
our simulations are accurate, however, the effects should cancel out
in our final measurement of w_P. But any discrepancies between our
simulations and real images will prevent this cancellation and
produce systematic errors in our measured w_P. We set generous upper
limits on discrepancies in our estimate of systematic error and find
that statistical error is still our major source of noise in θ_0.

Object Extractor, like Source Extractor, subtracts a back- 
ground profile from each source so that the faint (below detec- 
tion threshold) wings of each source do not make the surrounding 
sources appear brighter. But for both the real and simulated images, 
this process is imperfect and background subtraction may cause any 
source in the region surrounding a given source to be recorded as 
brighter or dimmer than it truly is. 

Source profiles are generally larger than PSFs, and we were able to
reduce our error in PSF width to at most 2%, so we focus on the
source profiles. In subsection 3.1, we noted that the intensity of a
typical source is at most 0.05 DETECT_THRESH at the distances at
which we search for pairs. This sets an upper limit on the
difference between observed and actual luminosity of roughly
|ΔL/L| = 0.05.

Source concentration is proportional to: 

a{L) oc e'"'^^' (36) 

and an uncertainty in L will lead to a localized uncertainty in σ.
This localized uncertainty in σ will produce or suppress pairs,
directly altering w(θ) at the scale of the improper background
subtraction.



Sample                 σ_bs     σ_κ      Total
25 < V < 26            0.038             0.038
26 < V < 27            0.031             0.031
27 < V < 28 GOODS      0.022    0.070    0.074
27 < V < 28 UDF        0.024             0.024
28 < V < 29            0.011    0.011    0.016

Table 10. Systematic errors due to background subtraction.



Applying a 0.05 fractional uncertainty in luminosity at the θ_0
scale leads to a background subtraction uncertainty, σ_bs, in θ_0 of
roughly:



da{L) dOo 
' cr(L) dL dw 



{e = Bo) 



(37) 



abs = 0.05 (^l+u;(6io) 
_ 0.05 x2xri ^ 

~ iog(io"-4)r 

The effect of false detections is twofold. If the detections were
randomly scattered throughout the image, the extra 0.5% of sources
would reduce the correlation function on all scales by 0.5%. Any
clustering of these sources could contribute to a positive
correlation function. But we do not observe strong clustering of
false detections near sources in subsection 3.6 on the scales we
probe here. We estimate that false detections are at most twice as
likely in the area θ < 2θ_0. This implies that only about 0.1% of
sources would have an extra partner on these scales. In subsection
6.3, we find that roughly 10% of sources have a partner on these
scales, so false detections are only a percent level source of
error. We neglect their contribution in our systematic error
estimate.

A separate source of significant error derives from the use of the
incompleteness factor, κ. We use this factor in our GOODS
27 < V < 28 and UDF 28 < V < 29 measurements to counter the effects
of incompleteness. The values we use reproduce the input correlation
function in our simulation measurements, but we do not understand
exactly how incompleteness affects w(θ). The farther κ is from its
ideal value of unity, the more uncertainty our use of κ implies. We
assign a fractional uncertainty in κ equal to 0.5(1 − κ) to produce
generous error bars in our measurements of θ_0 in incomplete
samples. This leads to an uncertainty σ_κ in θ_0 of:



0.5(1 -k) ^ 



(38) 



We do not apply systematic error to Γ, because Γ is highly
dependent on the small number of close pairs, and random errors
dominate systematic errors.

Figure 10. The fitted correlation function in GOODS 25 < V < 26 (top,
left), GOODS 26 < V < 27 (top, right), UDF 27 < V < 28 (bottom, left)
and UDF 28 < V < 29 (bottom, right). We show our best Γ = 2.5 and
Γ = 0.7 fits. w(θ) is offset from zero by 0.02 (0.1) for GOODS (UDF) so
that slightly negative points can be plotted on a log-log plot. This
causes distortion in the plots at large angles. The Γ = 0.7 fits are
actually (θ/θ_0)^{-0.7} + δ fits.

In this paper, we do not address the issue of cosmic variance. 
While each field is many times larger than our critical distance of 
roughly one arcsecond, cosmic source densities can vary on large 
scales and the effect of such variation on small scale clustering is 
unclear. It is likely that different clusters of sources are separated 
by cosmological distances along the line of sight. So by taking an 
angular measurement, we may be averaging out cosmic variance. 
There is no statistically significant variation in Γ or source
density σ_0 between GOODS North and GOODS South, and we have no
reason to believe that cosmic variance is a significant effect.

6.3 Multiplicity Fractions 

Measuring the fraction of sources in close pairs is another way to 
study clustering. It allows us to compare the correlation length with 
the average separation of sources. These fractions also give us de- 
tailed estimates of how many pairs we have in the sky. 

In Fig. 11 we see F_1(θ), the fraction of sources with at least one
neighbor within θ of them. In an unclustered sample, we would have:

F_{1u}(\theta) = 1 - e^{-\pi\sigma_0 (\theta^2 - \theta_{min}^2)}    (39)

While in a clustered sample, we have:



Sample         σ_0 (degree^-2)    θ_0       N_pairs(< 2θ_0) (degree^-2)
25 < V < 26    1.4 × 10^5         1.06"     1.6 × 10^4
26 < V < 27    2.8 × 10^5         0.880"    4.3 × 10^4
27 < V < 28    5.0 × 10^5         0.64"     1.1 × 10^5
28 < V < 29    7.8 × 10^5         0.493"    1.3 × 10^5

Table 11. Integrated source and pair counts (per square degree) in our
catalogs. The V < 27 sources are taken from GOODS South and the V > 27
sources are taken from UDF.



F_{1c}(\theta) = 1 - e^{-\pi\sigma_0 [(\theta^2 - \theta_{min}^2) + \frac{2\theta_0^{\Gamma}}{2-\Gamma} (\theta^{2-\Gamma} - \theta_{min}^{2-\Gamma})]}    (40)

In our samples, the average separation is 3" < (πσ_0)^{-1/2} < 8".
On scales smaller than this, F_1 would go as θ² if the samples were
unclustered. Instead, in Fig. 11 we see a steep initial rise with
many close pairs and then a flattening out at θ_0 as w(θ) ceases to
dominate.
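The two multiplicity fractions can be evaluated side by side. The sketch below assumes the clustered form is obtained by integrating the expected neighbor density 2πθ σ_0 (1 + (θ/θ_0)^(−Γ)) from θ_min to θ, which is our derivation rather than a formula confirmed by the text. The example inputs are approximate values for the 26 < V < 27 GOODS sample taken from Tables 5, 9 and 11.

```python
import math

ARCSEC2_PER_DEG2 = 3600.0**2


def f1_unclustered(theta, sigma0, theta_min):
    """Fraction of sources with a neighbor within theta for a Poisson field."""
    return 1.0 - math.exp(-math.pi * sigma0 * (theta**2 - theta_min**2))


def f1_clustered(theta, sigma0, theta_min, theta0, gamma):
    """Same fraction with a power-law w(theta); the formula is singular at gamma = 2."""
    p = 2.0 - gamma
    extra = 2.0 * theta0**gamma / p * (theta**p - theta_min**p)
    return 1.0 - math.exp(-math.pi * sigma0 * (theta**2 - theta_min**2 + extra))


sigma0 = 2.8e5 / ARCSEC2_PER_DEG2    # sources per square arcsec
theta_min, theta0, gamma = 0.55, 0.76, 2.5

for theta in (0.6, 1.0, 1.5, 3.0):
    print(theta,
          round(f1_unclustered(theta, sigma0, theta_min), 4),
          round(f1_clustered(theta, sigma0, theta_min, theta0, gamma), 4))
```

The clustered curve rises much faster just above θ_min and then runs roughly parallel to the unclustered one, which is the behavior described for Fig. 11.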

Finally, the F_1 function allows us to estimate the number of pairs
that we see on the sky. In Table 11, we see a roughly exponentially
increasing number of pairs within 2θ_0 as we go to fainter
magnitudes, which cuts off at V ≈ 28. The failure to find more high
magnitude pairs could be due to the fact that the V > 28 pairs would
be on scales very near the resolution limit of the instrument. In
any event, if we are to use the FSCF with roughly HST-like space
telescopes, we must probe faint sources, V > 25, to get good
statistics.

Figure 11. F_1(θ), the fraction of sources with a partner within θ
(left). The V < 27 sources are taken from GOODS South and the V > 27
sources are taken from UDF. The horizontal axis is θ (arcseconds).

The integral counts are roughly consistent with: 
N{V) = ]^o"'**^"^^'^^""'"'^'^"^"'''^'^"'°'^^ (41) 
This formula includes incompleteness in our datasets. 



7 DISCUSSION 

In this paper we have presented measurements of the two point
angular correlation function for faint (25 < V < 29) sources. This
measurement has been validated by extensive numerical simulation.
The observed correlation function is consistent with:

w(\theta) = (\theta/\theta_0)^{-2.5}; \theta_0(V) = 10^{-0.1 (V - 25.8)} arcsec    (42)

This measurement shows that the FSCF has a much steeper 
slope and larger normalization than the SDSS angular correlation 
function for LRGs would suggest if extrapolated. This is not sur- 
prising since we are looking at smaller scale physics, at bluer 
sources and likely at different redshifts. This measurement is not 
a direct extension of the LRG work, but instead an analogous mea- 
surement for a different dataset. 

There are many possible uses for this measurement. First, our 
observation of this galactic scale correlation function can be com- 
pared with numerical simulations of galaxy formation that include 
gravitational clustering, gas dynamics, star formation, etc. In this 
application it is important that the simulation data be processed in 
the same way as the observation. Alternately, it is possible to rean- 
alyze this data with a method that derives from the simulation. 

Our measurement errors are limited to roughly 10% by the data. They
could be reduced substantially with larger samples, as has happened
with SDSS and the bright source w(θ). It is unlikely that
significantly more deep field type data can be mined from the
existing HST instruments. However, the Wide Field Camera 3 should be
deployed this year and will provide 7 square arcminute exposures of
high resolution data with similar limiting magnitudes to the
Advanced Camera for Surveys (Kimble et al. 2006). Looking ahead, the
James Webb Space Telescope (JWST) is scheduled to launch in 2013 and
will provide 5 square arcminute frames with 0.1" resolution and a
limiting magnitude of at least K ≈ 25 (V ≈ 30 for galactic sources)
(Gardner et al. 2006). JWST should find many high redshift galaxies.
In Table 12, we see that if we could study the FSCF across many
fields to obtain several square degrees of data, we could vastly
improve our measurements of w(θ).

Table 12 also shows that several ground-based projects will provide
enough data to overcome the small amplitude of the FSCF on arcsecond
scales and yield significant measurements. The Dark Energy Survey
(The Dark Energy Survey Collaboration 2005) will produce a precise
measurement for the V < 24 sources that we do not study here. The
Panoramic Survey Telescope and Rapid Response System (Pan-STARRS)
(Jedicke et al. 2007) and the Large Synoptic Survey Telescope (LSST)
(Ivezic et al. 2008) could probe 30,000 deg² and 20,000 deg²
respectively and measure the FSCF of sources brighter than those we
study here with a precision of ≈ 10^-4.

But the superior resolution and huge area of possible future
space-based surveys make them the ideal candidates for this method.
The Supernovae Acceleration Probe (SNAP) would measure roughly
1,000 deg² with a limiting magnitude of approximately 28 (SNAP
Collaboration 2005). Destiny has a similar project goal (Benford &
Lauer 2006). These enormous surveys would yield FSCF results that
directly study V < 28 sources on the pertinent subarcsecond scales,
reducing the statistical noise we encounter here by a factor of
several hundred.

To learn more about these faint sources, we must know their 
distance. The sources we discuss in this paper are roughly 5 magni- 
tudes too dim for spectroscopy, 3 magnitudes too dim for traditional 
weak lensing measurements and roughly 2 magnitudes beyond the 
range where one can rely on training sets to produce accurate pho- 
tometric redshifts. 

Fortunately, lensing enables a new approach to measuring the
distance to these sources. Consider first a field of sources that is
gravitationally lensed by large scale structure. The density of a
given population of sources will vary inversely with the
magnification, μ. But amplification also brings faint sources above
the detection threshold, and if the number of sources near the
detection threshold goes as L^{-β}, then the density of total
sources goes as μ^{β−1}. In practice β ≈ 0.8 and the effects nearly
cancel. This and the fact that amplification on cosmic scales is
only a few percent make this measurement exceedingly difficult.

But measuring the effect of shear on the FSCF is relatively
straightforward and achievable using the large datasets mentioned
above. A uniform shear, γ, breaks the azimuthal symmetry of the FSCF
and, for a power law, the FSCF becomes:

w(\theta, \phi) = w(\theta) (1 + \gamma \Gamma \cos(2\phi))    (43)

where w(θ) is the unsheared FSCF, and φ is the angle between the
source-source separation vector and the axis of shear.

If we have high resolution data with a limiting magnitude of
roughly 28, we expect 10^5 close pairs per square degree. With a
1,000 square degree field, we would be statistically limited to
measuring w(θ) at the 10^-4 level of precision. A γ of 0.01 could in
turn be measured with roughly percent precision. Instead of using
this method to measure γ, however, we will use the superior
measurements of shear gained from traditional weak lensing of
brighter sources with calibrated photometric redshifts to determine
the redshift distribution of the faintest sources in the sky.
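The precision argument above is a short piece of arithmetic. The sketch below assumes a close-pair density of 10^5 per square degree (the value implied by the quoted 10^-4 precision for 1,000 square degrees) and treats the fractional error on the shear as the Poisson precision divided by the modulation amplitude γΓ. That ratio is a rough order-of-magnitude estimate, not a full error analysis.

```python
import math


def sheared_w(w_unsheared, phi, shear, gamma=2.5):
    """Eq. 43: w(theta, phi) = w(theta) * (1 + shear * Gamma * cos(2 phi))."""
    return w_unsheared * (1.0 + shear * gamma * math.cos(2.0 * phi))


pairs_per_deg2 = 1.0e5        # assumed close-pair density
survey_area_deg2 = 1000.0
n_pairs = pairs_per_deg2 * survey_area_deg2

sigma_w = 1.0 / math.sqrt(n_pairs)           # Poisson-limited fractional precision
modulation = 0.01 * 2.5                      # shear * Gamma for a shear of 0.01
relative_shear_error = sigma_w / modulation  # rough fractional error on the shear

print(sigma_w, relative_shear_error)
print(sheared_w(1.0, 0.0, 0.01), sheared_w(1.0, math.pi / 2.0, 0.01))
```

The two calls to sheared_w show the size of the azimuthal contrast: pairs aligned with the shear axis and pairs perpendicular to it differ in w by about 2γΓ, or 5% for these inputs.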

Cosmic shear is in many ways the most difficult type of lens- 
ing measurement to make, and it is likely that the 'Pair Lensing' we 
describe here will follow a similar observational path to traditional 
weak lensing, first being observed around clusters and galaxies and 
then being observed in large fields.

Dataset       Area (deg²)   Limiting V Magnitude   θ_min              w_faint(θ_min)   σ_w_faint/w_faint
HST-WFC3      5             27                     0.5"    3 × 10^?   2.8              3 × 10^-?
JWST          5             30                     0.5"    4 × 10^?   0.50             9 × 10^-?
DES           5000          24                     1.5"    2 × 10^?   1.0              1 × 10^-3
Pan-STARRS    30000         26.5                   1.5"    1 × 10^?   0.24             2 × 10^-4
LSST          30000         27.5                   1.5"    2 × 10^?   0.14             1 × 10^-4
SNAP          1000          28                     0.4"    1 × 10^?   2.8              2 × 10^-4

Table 12. Characteristics of upcoming datasets and their potential to
measure w(θ). The area and limiting magnitude of the HST-WFC3 and JWST are
estimated, and these datasets will not be contiguous or uniform surveys.
For JWST, we estimate the limiting V magnitude given a K ≈ 25 limit and
galactic sources. θ_min is an estimated minimum θ at which we could
reasonably make w(θ) measurements and is equal to roughly three times the
PSF width. w_faint(θ_min) is the size of the correlation function in the
faintest single magnitude bin assuming Eq. 42 holds. σ_w_faint/w_faint is
the fractional statistical uncertainty in this measurement assuming we bin
pairs with θ_min < θ < 1.58 θ_min.
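The w_faint(θ_min) column of Table 12 follows from the power-law fit evaluated at each survey's limiting magnitude and minimum angle. The sketch below assumes Eq. 42 has the form w(θ) = (θ/θ_0)^(−2.5) with θ_0(V) = 10^(−0.1 (V − 25.8)) arcsec, where the zero point is our reading of the fit. The limiting magnitudes and minimum angles are copied from the table.

```python
# (limiting V magnitude, theta_min in arcsec) copied from Table 12.
SURVEYS = {
    "HST-WFC3": (27.0, 0.5),
    "JWST": (30.0, 0.5),
    "DES": (24.0, 1.5),
    "Pan-STARRS": (26.5, 1.5),
    "LSST": (27.5, 1.5),
    "SNAP": (28.0, 0.4),
}


def w_faint(v_limit, theta_min, gamma=2.5):
    """Power-law correlation function at theta_min for limiting magnitude v_limit."""
    theta0 = 10.0 ** (-0.1 * (v_limit - 25.8))   # assumed form of the fit
    return (theta_min / theta0) ** (-gamma)


for name, (v_limit, theta_min) in SURVEYS.items():
    print(name, round(w_faint(v_limit, theta_min), 2))
```

Under this assumption the computed amplitudes agree with the tabulated column to within rounding. The ground-based surveys lose amplitude mainly because their larger θ_min sits on the steep part of the power law.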



We plan to explore the FSCF in more detail in future papers. In
paper II of this series we will compute the three point correlation
function for these faint sources. In paper III we will discuss the
theory of gravitational lensing on the FSCF in more detail and apply
our results to the three environments mentioned above. In paper IV
we will use existing data to attempt to observe this lensing
phenomenon and examine in depth the possibility of making a more
precise measurement in a larger dataset.
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